A LINKING CROSS BAR CONTROLLER
Eitan MEDINA and David SHEMLA
FIELD OF THE INVENTION
The present invention relates to network switches generally and to
cross-bars in particular.

BACKGROUND OF THE INVENTION
A network switch creates a network among a plurality of end nodes,
such as workstations, and other network switches connected thereto. Each end
node is connected to one port of the network. The ports also serve to connect
network switches together.

Each end node sends packets of data to the network switch which the
switch then routes either to another of the end nodes connected thereto or to a
network switch to which the destination end node is connected. In the latter case,
the receiving network switch routes the packet to the destination end node.

Each network switch has to temporarily store the packets of data which
it receives from the units (end node or network switch) connected to it while the
switch determines how, when and through which port to retransmit the packets.
Each packet can be transmitted to only one destination address (a "unicast"
packet) or to more than one unit (a "multicast" or "broadcast" packet). For
multicast and broadcast packets, the switch typically stores the packet only once
and transmits multiple copies of the packet to some (multicast) or all (broadcast)
of its ports. Once the packet has been transmitted to all of its destinations, it can
be removed from the memory or written over.
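
By way of illustration only, the store-once behaviour described above can be sketched as a buffer that tracks which ports still have to transmit each packet. The class name PacketStore and its methods are hypothetical and are not taken from the present specification; the sketch merely assumes that a packet's memory may be reclaimed once its set of pending ports is empty.

```python
class PacketStore:
    """Hypothetical model: each packet is stored once and freed after its
    last pending transmission."""

    def __init__(self):
        self._slots = {}    # packet id -> (payload, set of ports still pending)
        self._next_id = 0

    def store(self, payload, dest_ports):
        """Store one copy of the payload; remember every destination port."""
        pid = self._next_id
        self._next_id += 1
        self._slots[pid] = (payload, set(dest_ports))
        return pid

    def transmitted(self, pid, port):
        """Record that the packet left through `port` and return the payload.
        The slot is released once no destination port is pending."""
        payload, pending = self._slots[pid]
        pending.discard(port)
        if not pending:
            del self._slots[pid]
        return payload

    def __contains__(self, pid):
        return pid in self._slots
```

A unicast packet is simply the case in which the destination set holds a single port.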
Switching Ethernet Controllers (SECs) are network switches that
implement the Ethernet switching protocol. According to the protocol, the
Ethernet network (cabling and Ethernet ports) operates at 10 Megabits per
second. Switches which operate at the desired speed of 10 Megabits per second
are known as providing "full-wire" throughput.

In US patent application 08/790,155, filed January 28, 1997, and
incorporated herein by reference, in order to optimize through-put time,
communication between SECs attempts to utilize the bus as little as possible so
that the bus will be available as soon as a SEC wants to utilize it. Therefore, each
SEC includes a write-only bus communication unit which transfers the packets out
of the SEC by only writing to the bus. Thus, packets enter each SEC by having
been written therein from other SECs and not by reading them in, since read
operations utilize the bus for significant amounts of time compared to write
operations. Having the bus available generally whenever a SEC needs it helps to
provide the full-wire throughput.

However, when many SECs write to the same bus, the throughput is
limited by the speed of the bus.
SUMMARY OF THE PRESENT INVENTION

It is an object of the present invention to provide a cross-bar for
communicating between Switching Ethernet Controllers (SEC) and PCIs.

There is therefore provided in accordance with a preferred embodiment
of the present invention a data network including at least one crossbar, wherein
each crossbar comprises N ports and a plurality N of devices each associated
with and connected to one port of one of said crossbars. Each one port of one
crossbar includes an input buffer, a plurality N-1 of port output buffers, a
plurality N-1 of fullness sensors, and shutoff means.

The input buffer receives messages from the device connected to its port
and sends said messages to the other ports of said one crossbar. Each port
output buffer corresponds to one of said other ports, wherein each port output
buffer receives said messages only from said input buffer of its associated other
port. Each fullness sensor is associated with one port output buffer and
measures the fullness state of its associated port output buffer.

The shutoff means connects to the fullness sensors associated with the
port output buffers corresponding to said one port at said N-1 other ports, for,
when said fullness state for one of said other ports is generally full, indicating to
said device connected to said one port not to send data for the port which is
now generally full.

There is therefore provided a network wherein each device additionally
includes N-1 device output buffers, one per the N-1 other ports of said crossbar.

Additionally, there is therefore provided a network wherein each port
comprises a bus link connected to said corresponding associated device.
Moreover, each device also includes a multiplicity of direct memory
access (DMA) units for removing data from at least one of said device output
buffers.

Furthermore, each crossbar in the network includes an arbiter for
providing said messages from said N-1 port output buffers to said device
connected to its port only if said device is not full.

There is therefore provided in accordance with a preferred embodiment
of the present invention a switch for a data network, the data network including
at least one crossbar having N ports and the switch being connectable to one of
said N ports. The switch includes a multiplicity of switch output buffers, one per
the N-1 other ports of said crossbar and at least two direct memory access
(DMA) units, each associated with at least one of said switch output buffers, for
removing data from said associated at least one switch output buffers.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and appreciated more fully
from the following detailed description taken in conjunction with the drawings in
which:

Fig. 1 is a block diagram illustration of a network of switching
communication controllers, constructed and operative in accordance with a
preferred embodiment of the present invention;

Fig. 2 is a schematic illustration of a switching communication cross bar
controller forming part of the network of Fig. 1;

Fig. 3 is a schematic illustration of a portion of the cross bar of Fig. 2
and an Ethernet switching unit forming part of the switching network of Fig. 1;

Fig. 4 is a schematic illustration of the switching unit and link connection
forming part of the switching network of Fig. 1;

Fig. 5A is a flow chart illustration of the networking communication in
accordance with a preferred embodiment of the present invention;

Fig. 5B is an illustration of a local link communication in accordance with
the switching Ethernet controller network of Fig. 1;

Fig. 6A is a schematic diagram of an interface bus;

Fig. 6B is a block illustration of a message architecture used in the
interface bus of Fig. 6A;

Fig. 6C is a timing diagram illustration of the activity of the bus during the
operations of Fig. 6A; and

Fig. 7 is a block diagram illustration of the logical elements of a link
message used in the present invention.
DETAILED DESCRIPTION OF THE PRESENT INVENTION

Reference is now made to Figs. 1, 2, 3 and 4 which illustrate, in general
terms, a cross bar 10 of the present invention and its connection within a network,
wherein Fig. 1 illustrates a general overview of one or more cross bars 10
connected via one or more individual buses 14 to one or more Ethernet switches
12. Since each Ethernet switch 12 is connected to the cross bar 10 via its own
individual bus 14, the cross bar 10 typically provides linking operations,
transferring data from switch to switch. The one or more Ethernet switches 12 are
typically connected to one or more devices or work stations, not shown in the
figures.

As further illustrated in Fig. 1, the network switches 12 interconnect to
create a large network or to enlarge an existing network. A plurality of network
switches 12 are connected to PCI busses which are connected through
PCI-to-PCI bridges. Thus, two bus networks can be connected together through
the addition of another PCI bus and two PCI-to-PCI bridges.

Figs. 2, 3 and 4 illustrate the network of Fig. 1, and specifically cross bar
10 and elements of the associated switches 12, in some detail. In order to
facilitate understanding, switches 12, along with other similar type elements, have
been alphabetized to indicate location or sequence in the network. This
numbering is for explanation only. In addition, in order to facilitate understanding,
arrows have been added to represent data packet flow; however, for clarity, not
every data flow path has been mapped.

As shown in Fig. 2, cross bar 10 comprises a multiplicity of ports 16
where each port 16 comprises a link logic unit 18, an input FIFO buffer 20, a
plurality of output buffers 22 and a port transmit arbiter 26. Typically cross bar 10
has four ports 16; however, it may comprise any number of ports 16.

Each port 16, and the elements which it comprises, are dedicated to an
associated switch 12 or cross bar 10, and are responsible for all communication
with its associated switch 12 or cross bar 10. As an example, the elements which
comprise port 16A are associated with switch 12A and are responsible for switch
12A's communication: link logic unit 18A receives and directs messages and data
from switch 12A, and performs various port functions which will be described in
more detail hereinbelow; input FIFO buffer 20A receives and stores data packets
sent from switch 12A; output buffers 22 at port 16A receive and store data
packets sent to switch 12A; and port transmit arbiter 26A sends messages and
data to switch 12A.

Port to port communication is made via point-to-point connection
between input buffers 20 and their associated output buffers 22. As an example,
and as represented in Fig. 2 by solid lines, input buffer 20A is connected to
output buffers 22A. Output buffers 22A are located at ports 16B, 16C and 16D,
respectively.

Input buffer 20A transfers data packets via output buffers 22A to ports
16B, 16C and 16D. Additionally, though not represented in the Figures, input
buffer 20B is connected to output buffers 22B, located at ports 16A, 16C and 16D,
and so on. This point-to-point connection allows simultaneous non-collision data
transfers from the input buffers 20 to their dedicated output buffers 22, and hence
simultaneous communication between ports 16.
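
The point-to-point arrangement described above may be pictured, purely as a minimal sketch, as one dedicated FIFO per ordered pair of ports. The class name CrossBar and the attribute output_buffers are hypothetical; the sketch assumes only that each output buffer at a given port is fed by exactly one other port's input buffer, so that writes from different sources never share a queue.

```python
from collections import deque


class CrossBar:
    """Hypothetical model of the dedicated input-to-output buffer wiring."""

    def __init__(self, port_names):
        self.ports = list(port_names)
        # output_buffers[dest][src] is the FIFO located at port `dest`
        # that is written only by the input buffer of port `src`.
        self.output_buffers = {
            dest: {src: deque() for src in self.ports if src != dest}
            for dest in self.ports
        }

    def transfer_from(self, src, item):
        """Copy `item` from the input buffer of `src` into its dedicated
        output buffer at every other port."""
        for dest in self.ports:
            if dest != src:
                self.output_buffers[dest][src].append(item)
```

With four ports, each port holds three such buffers, which matches the N-1 port output buffers per port recited in the summary.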

Communication starts at switch 12, which writes a link message and/or
an accompanying data packet into its associated link logic unit 18. Link messages
are either sent alone or interleaved with data packets. Two major types of link
messages are possible: switch link messages to be transferred to another switch,
and local link messages indicating the status of the associated switch 12. Switch
link messages transferred with an accompanying data packet contain information
concerning the associated data packet.

Logic unit 18 reads the first bit of the link message, which identifies the
type of link message being transmitted. As an example, if the first bit is set, i.e. is 1,
then logic unit 18 recognizes the message as a local link
message, does not transfer the message, and proceeds to perform port functions
which will be described in more detail hereinbelow. If the first bit is not set, i.e. is 0,
then the logic unit 18 recognizes the link message as being a switch link message
and transfers the message and the associated packet to the input buffer 20.
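
The dispatch performed by logic unit 18 can be sketched as follows. The function name handle_link_message and its parameters are hypothetical, and the sketch assumes that the "first bit" is bit 15 of a 16 bit link message word, in line with the exemplary layout of Fig. 7; it is an illustration rather than the actual logic of the unit.

```python
LOCAL_LINK_BIT = 1 << 15  # assumption: the "first bit" is bit 15 of the word


def handle_link_message(word, packet, input_buffer, local_handler):
    """Route one link message the way logic unit 18 is described to.

    A set bit marks a local link message, which is handled at the port and
    not forwarded. A clear bit marks a switch link message, which is written
    together with its packet into the input buffer.
    """
    if word & LOCAL_LINK_BIT:
        local_handler(word)            # port functions, e.g. flow control
        return "local"
    input_buffer.append((word, packet))
    return "switch"
```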

As an example, switch 12A sends a link message and data packet to
logic unit 18A; logic unit 18A identifies the link message as a switch link message
and transfers both the switch link message and the data packet to input buffer
20A.

Via the direct point-to-point connections noted hereinabove, the input
buffer 20 on port 16 transfers the link message and data packet to its associated
output buffers 22 located on the other ports 16 on cross bar 10. As an example,
input buffer 20A transfers the message and packet to output buffers 22A located
on ports 16B, 16C and 16D.

Each port 16 is identified by a port address, and each device, connected
to one of the switches, is identified by its own device address or number. Each
output buffer 22 comprises a device table register 24, coupled thereto. When port
16 is linked to a switch 12, register 24 logically holds the device number of a
device linked to that specific port 16. When port 16 is linked to a cross bar 10,
register 24 logically holds the device numbers of all the devices connected to the
cross bar 10 linked to that specific port 16.

The device number is also logically contained within the switch link
message. When the switch link message and data packets are received at the
port 16, the packet is first received by the device table register 24. If the device
number data in the switch link message is included in the receiving register 24,
the message and packet are written into the coupled output buffer 22; if the device
number data in the switch link message is not included in the receiving register
24, then the register 24 simply ignores the message, and does not receive it.

As an example, the switch link message and its associated data packet
of the previous example are intended for transfer to a device linked to port 16B.
Output buffers 22A located on ports 16B, 16C and 16D receive the message and
the packet. The registers 24A at ports 16C and 16D do not include the device
number contained in the switch message, and thus ignore the message. The
register 24A coupled to output buffer 22A at port 16B does include the device
number, and hence writes the message and packet into output buffer 22A at port
16B. Output buffer 22A at port 16B then transfers the message and packet to the
port transmit arbiter 26B, which transfers the message and packet via link 14B to
switch 12B, and onto its eventual destination.
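
The filtering role of device table register 24 can be illustrated with the following minimal sketch. The class OutputBuffer, its method offer, and the device numbers used in any example are hypothetical; the sketch assumes only that a buffer accepts a message when, and only when, the device number carried in the switch link message appears in its register.

```python
class OutputBuffer:
    """Hypothetical model of an output buffer 22 and its coupled
    device table register 24."""

    def __init__(self, device_numbers):
        self.device_table = set(device_numbers)  # contents of register 24
        self.fifo = []                           # contents of output buffer 22

    def offer(self, device_number, message, packet=None):
        """Write the message and packet into the buffer only if the device
        number is listed in the register; otherwise ignore them."""
        if device_number not in self.device_table:
            return False
        self.fifo.append((message, packet))
        return True
```

When the same message is offered to the three buffers 22A at ports 16B, 16C and 16D, only the buffer whose register lists the destination device stores it, so no central routing decision is needed.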

It is common that some switches 12 and their dedicated ports 16 are
busier than others. For example, if switch 12A is excessively busy, the memory in
switch 12A may become too full to receive more data packets. In such a case,
the present invention implements flow control which ensures that no data is lost
during the temporary backup.

In this situation, if switch 12A is full, it notifies such to logic unit 18A.
Logic unit 18A then signals to port transmit arbiter 26A to cut off the data flow via
link 14A to switch 12A. As a result, switch 12A does not receive any more data
packets; however, it does continue to send link messages and/or data packets to
input buffer 20A. Once the memory in switch 12A is cleared, switch 12A sends a
message to logic unit 18A to re-open the link 14A and re-allow data packet
transfer to the switch.

While the link 14A to switch 12A is closed, all data packets sent to
switch 12A are temporarily stored in the output buffers 22B, 22C, and 22D,
located at port 16A. If port 16A continues to be busy, the temporarily stored data
messages may back up in one of the output buffers 22 at port 16A, as an
example, output buffer 22D at port 16A (Fig. 3).

Each output buffer 22 comprises an almost full threshold 23, which
when crossed, signifies that the output buffer 22 has become almost full and
cannot receive any more data packets. Similarly, each output buffer 22 comprises an
almost empty threshold 21, which when crossed, signifies that output buffer 22
has become almost empty and can again receive data packets.
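
The two thresholds together behave as a hysteresis band, which may be sketched as follows. The class FullnessSensor, the event names "stop" and "resume", and the numeric levels are hypothetical; the sketch assumes that a stop indication is raised once when the fill level reaches almost full threshold 23 and a resume indication once when it later falls to almost empty threshold 21.

```python
class FullnessSensor:
    """Hypothetical two-threshold (hysteresis) sensor for one output buffer."""

    def __init__(self, almost_full, almost_empty):
        if not almost_empty < almost_full:
            raise ValueError("almost_empty must lie below almost_full")
        self.almost_full = almost_full      # models threshold 23
        self.almost_empty = almost_empty    # models threshold 21
        self.blocked = False

    def update(self, fill_level):
        """Return 'stop', 'resume' or None for the new fill level."""
        if not self.blocked and fill_level >= self.almost_full:
            self.blocked = True
            return "stop"
        if self.blocked and fill_level <= self.almost_empty:
            self.blocked = False
            return "resume"
        return None
```

Because the resume level lies below the stop level, a buffer hovering near one threshold does not generate a stream of alternating indications.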

In accordance with a preferred embodiment of the present invention,
and as per the example from above, the data in output buffer 22D crosses almost
full threshold 23. As indicated in Fig. 3 by dashed arrows 29, the almost full
output buffer 22D (at port 16A) sends a message to the port transmit arbiter 26D
(at port 16D). The message notifies port transmit arbiter 26D that switch 12A is
almost full and requests that port 16D stop sending data to switch 12A.

Referring now to Fig. 4, each port transmit arbiter 26 communicates with
its associated switch 12 via link 14 and an associated switch arbiter 28. When
port transmit arbiter 26 receives a signal to cut off outgoing data packets, it
notifies such to switch arbiter 28, which partially halts the outward flow of data
packets; when port transmit arbiter 26 receives a signal to reopen communication,
it notifies such to switch arbiter 28, which then reopens outward flow.

As per the above example, port transmit arbiter 26D notifies switch
arbiter 28D of switch 12D (arrow 27) to stop transferring data to switch 12A.
Arbiter 28D stops transfer of data to switch 12A; however, it still transfers data to
other switches, as indicated by arrows 25.

When the temporarily stored data packets at output buffer 22D (at port
16A) have cleared out, and have crossed the almost empty threshold 21, the
output buffer 22D sends a message to port transmit arbiter 26D (at port 16D)
notifying it that switch 12A is almost empty. Port transmit arbiter 26D resumes
sending data packets to port 16A.

For the purpose of temporarily storing data packets while cross bar
arbiter 26 has halted traffic, and as shown in Fig. 4, each switch 12 has a plurality
of Direct Memory Access (DMA) units 30 and associated switch FIFO buffers 32.
Each DMA 30 is responsible for transfer of data from its associated FIFO buffer
32.

When port transmit arbiter 26D notifies arbiter 28D of switch 12D to stop
sending data packets to switch 12A, switch arbiter 28D so indicates to the DMA
unit 30A. DMA unit 30A stops transferring data from FIFO buffer 32A to switch
12A. However, the remaining DMA units 30 remain active, sending data through
port 16D to the other switches 12.

While the outgoing data traffic from switch 12 is closed, data packets
intended for transfer are temporarily stored in switch buffers 32. As noted above,
each DMA 30 is linked to an associated switch buffer 32. Additionally, each DMA
30 is dedicated to one or more different ports 16. Generally each DMA 30 is
dedicated to 2 ports, as an example, ports 16A and 16D; however, DMA 30 could be
dedicated to any number of ports depending on the switch's load.

When arbiter 28 receives a message to close off outflow of data from
switch 12 to an almost full port 16, only the outflow from the DMA 30 associated
with the "full" port 16 is cut off. The DMAs 30 associated with the other ports 16
remain active. Consequently, the only affected data flows are those of the
associated ports and not the entire outgoing data flow from switch 12.

Hence, in contrast to prior art network systems which required a total
data flow halt upon collision from backups, the flow control system of the present
invention allows data flow to continue, and stems only that traffic affected by the
backup.
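
The selective halt described above may be illustrated by the following minimal sketch of a switch arbiter that suspends only the queue serving a congested destination. The class SwitchArbiter and its methods are hypothetical, and, as a simplifying assumption, the sketch gives each destination its own queue, whereas the text notes that a DMA 30 is generally dedicated to two ports.

```python
from collections import deque


class SwitchArbiter:
    """Hypothetical model of switch arbiter 28 with per-destination queues
    standing in for the DMA units 30 and switch FIFO buffers 32."""

    def __init__(self, destinations):
        self.queues = {dest: deque() for dest in destinations}
        self.halted = set()

    def enqueue(self, dest, packet):
        self.queues[dest].append(packet)

    def stop(self, dest):
        """Suspend outflow toward one almost-full destination only."""
        self.halted.add(dest)

    def resume(self, dest):
        self.halted.discard(dest)

    def drain(self):
        """Send everything queued for destinations that are not halted and
        return the (destination, packet) pairs that were sent."""
        sent = []
        for dest, queue in self.queues.items():
            if dest in self.halted:
                continue              # packets stay stored, nothing is lost
            while queue:
                sent.append((dest, queue.popleft()))
        return sent
```

Packets for the halted destination remain queued and are sent after resume is called, while traffic to the other destinations proceeds unaffected.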

Reference is now made to Figs. 5A and 5B illustrating the
communication flow in the cross bar 10 among source switch 12A, port 16A on
cross bar 10, port 16B on cross bar 10 and destination switch 12B, summarizing
the flow discussed hereinabove.

Switch 12A transfers (step 110) a link message and a data packet to
logic unit 18A. The logic unit 18A identifies (step 112) the link message as a
switch link message and writes (step 114) the link message and data packet into
input buffer 20A.

The input buffer 20A transfers (step 116) the link message and data
packet to the registers 24A at ports 16B, 16C and 16D. The registers 24A at
ports 16B, 16C and 16D receive (step 118) the link message and the data packet
transferred from switch 12A. Registers 24A at ports 16C and 16D do not
recognize (step 120) the device number in the link message and ignore the
message and the data packet. The register 24A at port 16B recognizes (step
122) the device number in the link message, and the link message and data
packet are written (step 124) into the output buffer 22A at port 16B.

If switch 12B is not full, output buffer 22A at port 16B transfers (step
126) the data to port transmit arbiter 26B, which transfers (step 128) the message
and data via link 14B to switch 12B.

If switch 12B becomes full, it transfers (step 130, Fig. 5B) a local link
message to logic unit 18B indicating such. Logic unit 18B indicates (step 132) to
port transmit arbiter 26B to stop sending any messages and/or data to switch 12B.
Output buffer 22A at port 16B fills up (step 134) with temporarily stored
unsendable messages and data.

When output buffer 22A at port 16B reaches (step 136) almost full
threshold 23, it notifies (step 138) such to port transmit arbiter 26A at port 16A.
Arbiter 26A notifies (step 140) switch arbiter 28A at switch 12A not to send
any more data to port 16B until further notice. Switch 12A indicates (step 142)
to its DMA 30B (on switch 12A) to stop sending to switch 12B, and switch buffer
32B (on switch 12A and dedicated to port 16B) temporarily stores (step 143)
unsendable messages and data.

When switch 12B is capable of receiving again, it sends (step 144) a
local link message to logic unit 18B indicating that it is now open to receive.
Logic unit 18B indicates (step 146) to port transmit arbiter 26B to reopen inflow to
switch 12B. Output buffer 22A at port 16B restarts transmission (step 148).
When the output buffer 22A at port 16B reaches almost empty threshold 21 (step 150)
it notifies such to port transmit arbiter 26A at port 16A.

Port transmit arbiter 26A notifies (step 152) switch arbiter 28A at switch
12A to resume transfer of data to port 16B. Switch 12A indicates (step 154) to DMA
30B to resume sending data. Switch buffer 32B (on switch 12A and dedicated to
port 16B) resumes (step 156) transfer of messages and data to switch 12B.

Reference is now made to Figs. 6A, 6B, 6C and 7. Figs. 6A, 6B and 6C
describe a preferred embodiment of an interface bus used in the individual bus 14
of the present invention; Fig. 6A is a schematic diagram of the interface bus, Fig. 6B
is a message architecture used in the interface bus, and Fig. 6C is a timing
diagram illustration of the activity of the bus during operations.

Individual bus 14 is a 17 bit point-to-point bus and comprises a clock
signal 210, a command bit signal 212 and 16 bits of data signal 214. The 16 bit
data 214 transfers either a 16 bit link message or data packets. Fig. 7 is a block
diagram illustration of the logical elements of data bits 214.

As noted hereinabove, each link 14 provides the connection and
communication between one cross bar 10 and one switch 12, and transfers link
messages and data packets therebetween. As additionally noted hereinabove,
link messages 214 are transferred either alone or accompanied by data packets,
and switch link messages transferred with data packets are used to communicate
information about the associated data packet.

Clock 210 functions in a manner similar to network clocks known in the
art, and as such will not be described in detail herein. Command bit 212 is a one
cycle command word and data bits 214, depending on the command message,
comprise between 0 and 33 cycles of 16 bit data words. In accordance with a
preferred embodiment of the present invention, command messages signal the
commencement or end of a data packet. When not transmitting a command
message, command bit 212 transmits an idle signal.

Referring now to Fig. 6C, signals are constantly being transferred over
links 14, either in the form of commands, idle signals, link messages, or data
packets. The timing signal rises to high with the initialization of a command bit
212, remains at high for the duration of an idle message, sinks from the high
signal at the commencement of a link message/data packet and remains at low
throughout the duration of the link message/data packet. Hence, the rising or
sinking state of the command bit indicates whether to anticipate commands or to
anticipate message/data.

Referring now to Fig. 7, an exemplary protocol for data packet 214 is
shown which comprises 16 bits: bit 15 is a link message bit 220, bits 14-11
provide high address bits 222, bits 10-6 indicate a device number 224, and bits
5-0 indicate a message type 226.

As noted hereinabove, the link message bit 220 is either set or not set
and is used to signify either a local link message or a switch link message,
respectively. The high address 222 is provided for PCI address mapping and
used for communication between switches 12 and PCIs connected to the network.
The device number 224 identifies the number of the device designated to receive
the link message and/or data packet. The message type 226, as described in US
patent application 08/790,155, which is incorporated herein by reference, relays
messaging protocol between switches 12.
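
The exemplary 16 bit layout of Fig. 7 can be expressed, as a minimal sketch, with shift-and-mask helpers. The names LinkMessage, pack and unpack are hypothetical and the field values used in any example are arbitrary; only the bit positions (bit 15, bits 14-11, bits 10-6 and bits 5-0) are taken from the description above.

```python
from typing import NamedTuple


class LinkMessage(NamedTuple):
    """Hypothetical container for the four fields of the 16 bit word."""
    local: bool          # link message bit 220 (bit 15); set means local
    high_address: int    # high address bits 222 (bits 14-11), 4 bits
    device_number: int   # device number 224 (bits 10-6), 5 bits
    message_type: int    # message type 226 (bits 5-0), 6 bits


def unpack(word):
    """Split a 16 bit word into its four fields."""
    return LinkMessage(
        local=bool((word >> 15) & 0x1),
        high_address=(word >> 11) & 0xF,
        device_number=(word >> 6) & 0x1F,
        message_type=word & 0x3F,
    )


def pack(msg):
    """Assemble a 16 bit word, rejecting fields that do not fit."""
    if not (0 <= msg.high_address <= 0xF
            and 0 <= msg.device_number <= 0x1F
            and 0 <= msg.message_type <= 0x3F):
        raise ValueError("field out of range")
    return ((int(msg.local) << 15) | (msg.high_address << 11)
            | (msg.device_number << 6) | msg.message_type)
```

The 5 bit device number field allows up to 32 distinct device numbers under this layout, and the four field widths (1, 4, 5 and 6 bits) account for all 16 bits of the word.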